ABSTRACT

We have created mock Sunyaev-Zel'dovich effect (SZE) surveys of galaxy clusters using high resolution 
N-body simulations. To the pure surveys we add 'noise' contributions appropriate to instrument and 
primary CMB anisotropies. Applying various cluster finding strategies to these mock surveys we generate 
catalogues which can be compared to the known positions and masses of the clusters in the simulations. 
We thus show that the completeness and efficiency that can be achieved depend strongly on the frequency 
coverage, noise and beam characteristics of the instruments, as well as on the candidate threshold. We 
study the effects of matched filtering techniques on completeness and bias. We suggest a gentler filtering
method than matched filtering in single frequency analyses. We summarize the complications that arise 
when analyzing the SZE signal at a single frequency, and assess the limitations of such an analysis. Our 
results suggest that some sophistication is required when searching for 'clusters' within an SZE map. 

Subject headings: Galaxies-clusters, cosmology-theory 



1. INTRODUCTION 

Observations of the number density of clusters of galax- 
ies will play an increasingly important role in determining 
the composition of the energy density in the universe as 
data from the myriad of upcoming cluster surveys accu- 
mulates. Cluster surveys result in constraints orthogonal 
in parameter space to those obtained from other cosmo- 
logical observations, such as the Cosmic Microwave Back- 
ground (CMB) anisotropies and supernova searches, be- 
cause the cluster abundance depends significantly on the 
linear growth function. For this reason, clusters can also be 
used to probe the nature and evolution of the dark energy 
in the universe. Since clusters are the most recently formed 
gravitationally bound objects in the universe, the evolu- 
tion of their number density sensitively probes the critical 
redshift range 0 < z < 2, a range over which the
dark energy has come to dominate the total energy den- 
sity. Clusters are convenient in that they are very bright, 
and rare enough to make counting them tractable. 

A wide array of survey techniques is being used to con- 
duct searches for clusters, making use of optical and X-ray 
emissions from clusters, weak lensing distortions, and the 
Sunyaev-Zel'dovich effect (SZE; see Table 1). The SZE is a
particularly promising approach for finding galaxy clusters 
because the signal is relatively independent of the cluster's 
distance from us. This implies that, at least in principle, 
the selection function for such surveys is very well known. 
This is crucial if we are to use the cluster catalogues to 
measure the evolution of the number density of clusters. 

Using the SZE does present certain difficulties. The en- 
ergy lost by the CMB photons on their journey from the 
last scattering surface is an integrated effect. Hence the 
SZE signal suffers from projection effects from other ob- 
jects in the same line of sight as the cluster, and also yields 
no information about the redshift of the cluster save its an- 
gular size on the sky. Thus, to study evolution effects in 
the number density, a followup observation is required to 
obtain the redshifts, and in rare cases distinguish between
two separate clusters that lie in the same line of sight.
The primary CMB anisotropies are also a problem, having 
significant power on cluster-sized angular scales. Finally, 
since clusters of galaxies are not perfect, isolated spheres 
of gas, the SZE signal obtained from a cluster of a given 
mass will vary considerably depending on the particular 
line of sight through the cluster. These effects make it 
difficult to correlate the SZE signal with the actual mass 
of the cluster causing it. In this paper we make a prelimi- 
nary investigation of the power of SZE surveys in finding 
clusters, taking these effects into account, and present the 
results in terms of the completeness and efficiency of mass 
limited samples achievable using the SZE. 

2. METHOD 

We construct maps of the SZE effect at various frequen- 
cies using as input a high-resolution N-body simulation of 
structure formation in a $\Lambda$CDM cosmology. In this way our
method is similar to that of Kay, Liddle & Thomas (2001). 
We make mock observations of these maps by adding signal 
from the primary CMB anisotropies to the SZE maps, con- 
volving with a beam window function, and adding Gaus- 
sian random noise such as might be produced by the elec- 
tronics in a real observation. We identify every cluster can- 
didate in these mock observations using a specified method 
and check it against the true 3-D positions of the clusters 
in the same simulation. 

Table 1

Some upcoming Sunyaev-Zel'dovich experiments. Type indicates the nature of the receivers, HEMTs or
Bolometers. The frequency is given in GHz. The resolution is an estimate of the beam size, in arcminutes, and
for the interferometers this estimate is quite approximate. The last 6 experiments intend to undertake blank
field SZ surveys. More information on these experiments can be found at the listed web sites.

We present our results in terms of the completeness and efficiency of the method in finding
clusters above a mass threshold. Completeness is the ratio of the number of clusters we found
using the mock SZE observation to the total number of massive clusters in the field of view.
Out of the total number of cluster candidates that we identify in our SZE maps, only some
will actually be clusters with a mass above the threshold of interest. Efficiency measures
the ratio of clusters found to the total number of candidates, and so quantifies the
contamination suffered when using the SZE technique. The survey efficiency will be important
in planning follow-up observations with other instruments. Obviously an SZE survey will be
useful for many things besides creating an effectively mass selected sample, but it is such
a sample which is the easiest to compare with theories of structure formation. It has also
become commonplace to describe SZE surveys as "effectively mass limited", and it is for
these reasons that we focus on this metric here.
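As a minimal sketch of the two statistics defined above, the function below computes them from simple counts. The function and argument names are hypothetical, and the example assumes that each recovered halo corresponds to exactly one candidate peak, which the text does not state.

```python
def completeness_and_efficiency(n_halos, n_halos_matched, n_peaks, n_peaks_matched):
    """Return (completeness, efficiency) for one mock field.

    n_halos          : halos above the mass threshold in the field
    n_halos_matched  : of those, how many coincide with a candidate peak
    n_peaks          : candidate peaks above the detection threshold
    n_peaks_matched  : of those, how many coincide with such a halo
    """
    if n_halos <= 0 or n_peaks <= 0:
        raise ValueError("need at least one halo and one candidate")
    completeness = n_halos_matched / n_halos
    efficiency = n_peaks_matched / n_peaks
    return completeness, efficiency


# Counts quoted in the Fig. 1 caption: 14 massive halos, 12 found, 15 peaks.
# ASSUMPTION: the 12 found halos map onto 12 distinct peaks.
completeness, efficiency = completeness_and_efficiency(14, 12, 15, 12)
print(f"completeness = {completeness:.2f}, efficiency = {efficiency:.2f}")
```

Keeping the two ratios separate makes it explicit that they have different denominators, which is why one is not simply the inverse of the other.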

2.1. The N-body simulation 

The starting point for constructing the maps is an ac- 
curate model of the spatial distribution of mass along the 
past light-cone. We obtain this from an N-body simulation 
of $512^3$ particles in a (periodic) cube of side $300\,h^{-1}$ Mpc
run with a TreePM-SPH code (see the appendix of White
2002). Since on the scales of relevance to us baryonic
pressure is sub-dominant, only collisionless dark matter
is modeled allowing us to achieve a higher dynamic range 
in the simulation. This allows us to simulate a larger vol- 
ume, containing more of the rare rich clusters we are inter- 
ested in, at the expense of an ad hoc (but flexible) treat- 
ment of the baryonic physics. The simulation is started at 
z = 60 and evolved to the present with the full phase space 
distribution dumped every $100\,h^{-1}$ Mpc between redshifts
2 > z > 0. It is this range of redshifts which dominates the SZE signal on the angular
scales of interest to us, but by cutting off the integration at z = 2 we will underestimate
the effect of confusion in our maps. The gravitational softening used is of a spline form,
with a "Plummer-equivalent" (comoving) softening length of $20\,h^{-1}$ kpc. We have used a
flat cosmology compatible with a host of current observations: $\Omega_m = 0.3$,
$\Omega_\Lambda = 0.7$, $\Omega_b h^2 = 0.02$, $h = 0.7$, $n = 1$, and $\sigma_8 = 1$. The
transfer function was evaluated with the fitting function of Eisenstein & Hu 1999. While a
slightly lower $\sigma_8$ would better fit the inferred mass function of rich clusters from
X-ray surveys, the higher $\sigma_8$ provides an easier match to the CBI deep field
observations (Mason et al. 2002). The mass resolution in the simulation is fine enough to
identify galactic mass halos, with non-interacting dark matter particles of mass
$1.7\times10^{10}\,h^{-1}M_\odot$. All of the relevant cluster-scale halos contain several
thousand particles to begin to resolve sub-structure. The simulation was performed on 128
processors of the IBM-SP2 at NERSC, took nearly 4000 time steps and approximately 100 wall
clock hours to complete.
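The quoted particle mass follows from the box size, particle number and $\Omega_m$. The short check below is a sketch of that arithmetic; the value used for the critical density, $2.775\times10^{11}\,h^2 M_\odot\,{\rm Mpc}^{-3}$, is a standard constant supplied here and not taken from the text.

```python
RHO_CRIT = 2.775e11  # critical density today in h^2 Msun / Mpc^3 (assumed constant)


def particle_mass(omega_m, box_side, n_per_side):
    """Mass per particle in h^-1 Msun for a box of side box_side (h^-1 Mpc)."""
    total_mass = omega_m * RHO_CRIT * box_side ** 3
    return total_mass / n_per_side ** 3


m_p = particle_mass(omega_m=0.3, box_side=300.0, n_per_side=512)
print(f"particle mass ~ {m_p:.2e} h^-1 Msun")
print(f"particles in a 1e14 h^-1 Msun halo ~ {1e14 / m_p:.0f}")
```

The second line of output is the basis for the statement that cluster-scale halos contain several thousand particles.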

To construct the long thin line of sight used to compute the net SZE, we have stacked the
intermediate stages of the simulation between redshifts 2 > z > 0. In order to avoid
multiply sampling the same large scale structures, each $300\,h^{-1}$ Mpc box has been
randomly re-oriented in one of the six possible orientations, and has furthermore been
shifted by a random amount, perpendicular to the line-of-sight, making use of the periodic
boundary conditions. There are three time dumps per box length. Each $300\,h^{-1}$ Mpc
volume in the stack is made up of three segments, each segment evolved to a later epoch than
the previous one by the time it takes light to travel $100\,h^{-1}$ Mpc. We have chosen
$100\,h^{-1}$ Mpc as the sampling interval because it is large enough that edge effects are
minimal, yet fine enough that the line of sight integrals are well approximated by sums of
the (static) outputs. Because of the periodicity, we are free to choose any of the thirds as
the oldest, cyclically permuting the other two. This approach preserves the continuity of
large-scale structure over distances of $300\,h^{-1}$ Mpc without compromising the
resolution in time evolution.
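The re-orientation and shift of each box can be sketched as follows. This is an illustration only: it assumes that the "six possible orientations" are the three choices of line-of-sight axis viewed from either end, and the function names are hypothetical. Reversing the line of sight here is a reflection, which is adequate for a sketch but may differ from what was actually done.

```python
import random

BOX = 300.0  # box side in h^-1 Mpc


def random_orientation(rng):
    """Draw a line-of-sight axis, a viewing direction, and a transverse shift."""
    los_axis = rng.randrange(3)              # which of x, y, z is the line of sight
    flipped = rng.random() < 0.5             # view the box from the far side
    shift = (rng.uniform(0.0, BOX), rng.uniform(0.0, BOX))
    return los_axis, flipped, shift


def reorient(position, orientation, box=BOX):
    """Map (x, y, z) to (transverse_1, transverse_2, line_of_sight)."""
    los_axis, flipped, (s1, s2) = orientation
    t1 = position[(los_axis + 1) % 3]
    t2 = position[(los_axis + 2) % 3]
    los = position[los_axis]
    if flipped:
        los = box - los
    # The periodic wrap keeps every coordinate inside [0, box).
    return ((t1 + s1) % box, (t2 + s2) % box, los % box)


rng = random.Random(1)
orientation = random_orientation(rng)
print(reorient((10.0, 20.0, 30.0), orientation))
```

Only the transverse coordinates are shifted, so structure remains continuous along the line of sight within a box.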



2.2. Cluster catalog 

In order to compute the completeness and efficiency with which the mock SZE survey can
detect clusters, we need to know the true distribution of clusters in the simulated fields.
To this end we construct a catalog of the 3-D position, redshift, mass, velocity dispersion,
and other useful quantities of each identified halo above $10^{13}\,h^{-1}M_\odot$. Halos
are identified using a friends-of-friends algorithm (Davis et al. 1985) on each of the time
dumps used in the line of sight integral. The FoF algorithm partitions the particles into
equivalence classes by linking together all particle pairs separated by less than a distance
b. We use a linking length of b = 0.15 times the mean inter-particle spacing, which is
smaller than the typical value of b = 0.2 because it reduces the number of instances in
which two separate halos, connected by a filament of significant overdensity, get
accidentally classified as a single object. Such misclassifications were found to be a
significant source of confusion when computing completeness and efficiency (see also White &
Kochanek 2002, Kochanek et al. 2003, White, van Waerbeke & Mackey 2002 for further
discussion). While the spherical overdensity algorithm, which is less susceptible to the
merging problem, could have been used, this algorithm identifies spherical clusters, which
would introduce a different type of bias. We use the position of the potential minimum of
the FoF group as the center of the cluster because it is more robust than using the center
of mass, coinciding closely with the density maximum of the cluster for all but the most
anomalous clusters. Centering on the potential minimum we computed, for each halo, the mass,
$M_{200}$, enclosed within a radius, $r_{200}$, interior to which the density contrast was
200 times the critical density. There are 6887 halos in the $z = 0$ 3D catalogue with
$M_{200} > 10^{13}\,h^{-1}M_\odot$, of which 790 are more massive than
$10^{14}\,h^{-1}M_\odot$ and 9 have masses greater than $10^{15}\,h^{-1}M_\odot$.
A typical simulated field will contain two or three hundred halos more massive than
$10^{14}\,h^{-1}M_\odot$.
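The friends-of-friends partition described above can be illustrated with a brute-force sketch. This is not the code used for the catalogue: it compares all pairs (so it is only practical for small particle sets), ignores periodic boundaries, and uses hypothetical names. The union-find structure is one common way to build the equivalence classes.

```python
import math
from itertools import combinations


def friends_of_friends(points, linking_length):
    """Group points so that any pair closer than linking_length shares a group.

    Returns a list of groups (lists of point indices), largest first.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the group root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) < linking_length:
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_j] = root_i

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values(), key=len, reverse=True)


# Linking length for b = 0.15 with 512^3 particles in a 300 h^-1 Mpc box.
b_length = 0.15 * 300.0 / 512
print(f"linking length ~ {b_length:.4f} h^-1 Mpc")

demo = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.2, 0.0, 0.0), (5.0, 5.0, 5.0)]
print(friends_of_friends(demo, linking_length=0.15))
```

In the demo the first three points form a chain: the end points are farther apart than the linking length but are joined through the middle one, which is the same mechanism by which a filament can bridge two halos.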

2.3. Compton-y maps 

Fig. 1. One SZE map, smoothed with a Gaussian of 1' FWHM but with no noise added. The
grey-scale is logarithmic, running from $10^{-5.5}$ to $10^{-3.5}$. There are 14 halos more
massive than $4\times10^{14}\,h^{-1}M_\odot$ in the 3° × 3° field (solid circles), of which
12 are found. There are 15 peaks above a threshold $y_{\rm cut} = 5\times10^{-5}$ (dashed
circles).

Because the simulation contains no gas we use a semi-analytic model to include the gas
physics. First we assume that the gas closely traces the dark matter. This is likely a good
approximation in all regions except the innermost O(100) kpc of the cluster, which for
clusters at cosmological distances will be unresolved by the experiments of interest. (For
an $\Omega_m = 0.3$ flat cosmology, 100 kpc subtends only 0.26' at z = 0.5 while upcoming
survey experiments have beams of ~ 1'.) We ignore the presence of cold gas and stars in the
ICM, assuming that the mass in hot gas is $\Omega_b/\Omega_m$ of the total. Second, each
cluster is assumed isothermal. Our assumptions so far are similar to those of Cooray, Hu &
Tegmark (2000), but they additionally assumed that all of the gas in the universe was at
fixed temperature, independent of the mass and virial temperature of the halo in which it
resided. Instead we assign to each particle in a group a temperature
where $E(z) = H(z)/H_0$, $\Delta_c$ is the density threshold defining the mass and $T_*$
gives the overall normalization of the relation. In principle one could use a less steep
function for lower masses, since there is some evidence that the $T$-$M$ relation becomes
shallower at low mass (Finoguenov, Reiprich & Bohringer 2001). However we also expect the
gas fraction to drop to lower masses, and hydrodynamic simulations indicate that we can
roughly mimic the effect of this on our SZE maps by keeping the slope of Eq. (1) fixed and
holding $f_{\rm gas}$ at its universal value. Finally, we smoothly take $T \to 0$ for halo
masses below $10^{13}\,h^{-1}M_\odot$. How we do this does not influence the results. Since
most of the SZ emission comes from gas at significant overdensities (da Silva et al. 2001;
White, Hernquist & Springel 2002), considering only the particles in the halos when making
the maps is a good approximation.

We use $T_* \simeq 1$ throughout$^1$, which gives good agreement for the redshift evolution
of the mean mass weighted temperature and the angular power spectrum of $y$ when compared to
the results of hydrodynamic simulations (White, Hernquist & Springel 2002). In particular
this method provides a better fit to the shape of the simulation based angular power
spectrum, especially over the peak, than semi-analytic methods (Komatsu & Kitayama 1999;
Cooray; Molnar & Birkinshaw 2000; Holder & Carlstrom; Komatsu & Seljak 2002). As our
approach has isothermal clusters, with a deterministic temperature derived solely from the
mass, this somewhat limits the possible sources of discrepancy between the semi-analytic and
hydrodynamic calculations. There is a tendency for the N-body results to slightly
underpredict (by tens of percent) the low-$\ell$ tail and to have less low level unresolved
emission than the hydro based maps. Since the low-$\ell$ tail is sensitive to the volume
used in constructing the maps, and the hydro simulations were run in smaller boxes, we do
not regard this disagreement as significant.

We choose $T_*$ so that the power is close to the level seen
by CBI in their deep field (Mason et al. 2002). This is a
factor of approximately 4 larger than would be predicted
by the 'concordance' cosmology. Our results are relatively
insensitive to the precise value of $T_*$ chosen or to the treatment
of 'gas' outside of the virialized regions of halos.



We generate Compton-$y$ maps by integrating for each pixel

$$y = \sigma_T \int n_e\,\frac{k_B T_e}{m_e c^2}\,dl .$$

$^1$ For a power-law spectrum, $P(k) \sim k^n$, the SZE angular power
spectrum scales as $T_*^{2}\,\sigma_8^{14/(3+n)}$. Matching the local temperature
function of rich clusters requires $\sigma_8 \sim T_*^{-\gamma}$ with $\gamma \simeq 0.7$-$0.9$. Thus
increasing $T_*$ much above 1 drastically lowers the predicted SZE
fluctuations if one maintains agreement with the observed XTF.
Fig. 2. The 3 stages of our mock SZE survey. (Left) the signal; (middle) combined with the primary CMB anisotropies, smoothed with
a 2' FWHM beam and with 5 $\mu$K per 2' pixel of Gaussian random 'instrument' noise added (the absolute value is plotted); (right) the middle
map filtered with the matched filter algorithm described in the text to increase the contrast of the clusters against the background (the
absolute value is plotted). Each map is 3° × 3° and contains $1024^2$ pixels, rebinned to $256^2$ for plotting purposes. The greyscale in each case
is logarithmic, with black 2 orders of magnitude below the peak value (white).
Here $\sigma_T$ is the Thomson scattering cross section, and $n_e$, $m_e$ and $T_e$ are the
electron number density, mass and temperature respectively. We assume that within the
clusters the gas is fully ionized. The contribution from each particle is distributed over
the pixels with a spline weighting and a (physical) size equal to the smoothing length of
the simulation. The temperature fluctuation at frequency $\nu$ is then obtained from the
$y$-maps by

$$\frac{\Delta T}{T} = y\left(x\,\frac{e^x+1}{e^x-1} - 4\right) \simeq -2y ,$$

where $x = h\nu/k_B T_{\rm CMB} \simeq \nu/56.84\,$GHz is the dimensionless frequency and
the second expression is valid in the Rayleigh-Jeans limit. In what follows we shall assume
the low-frequency limit unless otherwise stated.
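The frequency dependence of the thermal SZE can be checked numerically. The sketch below evaluates the standard non-relativistic spectral factor $x(e^x+1)/(e^x-1) - 4$, using the scale of 56.84 GHz quoted above; the function name is hypothetical.

```python
import math


def sze_spectral_factor(nu_ghz, x_scale_ghz=56.84):
    """Return f(x) such that Delta T / T = f(x) * y, with x = nu / x_scale."""
    x = nu_ghz / x_scale_ghz
    return x * (math.exp(x) + 1.0) / (math.exp(x) - 1.0) - 4.0


for nu in (1.0, 30.0, 150.0, 217.7, 350.0):
    print(f"{nu:6.1f} GHz  f = {sze_spectral_factor(nu):+.3f}")
```

At low frequency the factor tends to -2, the Rayleigh-Jeans value adopted in the text. It changes sign near $x \approx 3.83$, which is one reason multi-frequency data can help to separate the SZE from the primary anisotropies.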

While these assumptions are crude, comparison of the
maps with those made from the more sophisticated hydrodynamic
simulations of White, Hernquist & Springel (2002)
shows that they capture many of the features of the more
detailed modeling. The signals of interest are dominated 
by group and cluster sized halos which are quite regular in 
their gas properties, allowing a relatively simple minded 
treatment for our purposes. It is important to note that 
since we are trying to test rather than calibrate our clus- 
ter extraction methods our requirements on the simulated 
maps are not too stringent. 

We generate 10 random SZE maps, each 3° × 3° with
$1024^2$ pixels. An example map produced in this manner is
shown in Figure 1. The map is clearly dominated by discrete
sources, the galaxy clusters and rich groups, having
a typical size of about 1' and a typical amplitude on the 
order of mJy. 

2.4. Primary CMB anisotropies 



The primary CMB anisotropies contribute significantly 
to the SZE signal on the scales of interest to us, and it 
is important to consider the effects of contamination in- 
troduced by such fluctuations unless we can use multi- 
frequency information to suppress the primary anisotropies. 
To investigate this we generate a large CMB map from 
which we extract a region of the same size as the SZE 
map, and add it in as 'noise'. We have used the CMBfast code to generate the CMB power
spectrum for our cosmology (Seljak & Zaldarriaga 1996). The CMB map is then a random
realization of a Gaussian field with this power spectrum. We generate random phases in
momentum space, and assign amplitudes to each of the $k$-modes using a distribution whose
average value is the amplitude in the CMB power spectrum. We have used the flat sky
approximation, in which the $k$-mode in momentum space corresponds to the $\ell$ value in
the CMB power spectrum. The probability distribution function for the amplitudes, $p$, of
each $k$-mode is given by

$$P(p^2) = e^{-p^2/C_\ell} \qquad (5)$$

Thus if $\epsilon$ is a random number between 0 and 1, $p^2$ will be given by

$$p^2 = -C_\ell \ln(\epsilon) \qquad (6)$$
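Equation (6) is an inverse-transform draw from an exponential distribution, so the mean of $p^2$ equals $C_\ell$. A minimal sketch of drawing one Fourier mode this way is given below; the function names and the input value of $C_\ell$ are illustrative, and building a full map (with the reality condition on the modes and an inverse FFT) is not shown.

```python
import math
import random


def sample_mode_power(c_ell, rng):
    """Draw p^2 = -C_ell * ln(eps) with eps uniform in (0, 1]."""
    eps = 1.0 - rng.random()   # rng.random() is in [0, 1), so eps avoids log(0)
    return -c_ell * math.log(eps)


def sample_mode(c_ell, rng):
    """Return one complex mode with random phase and amplitude p."""
    p = math.sqrt(sample_mode_power(c_ell, rng))
    phase = rng.uniform(0.0, 2.0 * math.pi)
    return complex(p * math.cos(phase), p * math.sin(phase))


rng = random.Random(0)
c_ell = 2.0  # illustrative power at one multipole, arbitrary units
powers = [sample_mode_power(c_ell, rng) for _ in range(20000)]
print(f"mean p^2 = {sum(powers) / len(powers):.3f} (input C_ell = {c_ell})")
print(sample_mode(c_ell, rng))
```

A uniform phase combined with an exponentially distributed $p^2$ is equivalent to drawing independent Gaussian real and imaginary parts, which is what makes the resulting field Gaussian.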

2.5. Matched filter 

After adding the CMB into the map, we convolve with a 
Gaussian beam, which smears the signal and obscures any
information on scales much smaller than the beam size.
To complete the mock observation, we add a few $\mu$K of
white noise to each pixel corresponding to expected levels 
of noise from the electronics used in real observations. This 
introduces fluctuations on scales much smaller than the 
size of the features in the beam smeared map. 

In order to remove the CMB anisotropies, and smooth 
the small scale noise, it is necessary to filter the map; we 
examine a matched filter algorithm to optimize the con- 
trast of signal to noise. The specific filter is described 
in Tegmark & de Oliveira-Costa (1998, Eq. 6; see also Haehnelt & Tegmark 1996). It is
azimuthally symmetric and has a radial dependence in Fourier space of

$$W_\ell \propto \frac{1}{B_\ell\,C_\ell^{\rm tot}} .$$

Here $B_\ell = e^{-\theta^2 \ell(\ell+1)/2}$ and $C_\ell^{\rm tot}$ refers to the sum of the
power spectra of elements to be removed from the map, in our case the CMB and white noise.
The noise power spectrum is given by
$C_\ell^{\rm noise} = (\sigma_{\rm pix}\theta_{\rm pix})^{2} B_\ell^{-2}$ (e.g. Tegmark &
Efstathiou 1996). For the normalization of the filter, we use the $1\sigma$ value of the
noise (only) map after it has been filtered with $W_\ell = 1/B_\ell C_\ell^{\rm tot}$. In
this way, we can compare mock observations with different beam sizes and levels of noise in
terms of the statistical significance of the cluster SZE signal above the white noise in the
map. Note that in some cases the instrumental noise level will be significantly below the
'noise' induced from CMB anisotropies, so amplitudes in these maps can be many 'noise
sigma'.
The matched filter formally maximizes the efficiency 
with which the SZE survey will locate isolated clusters 
because it maximizes the contrast between the signal and 
the background noise. In a real cluster survey, however, 
the analysis team may wish to sacrifice efficiency some- 
what if in doing so it can provide a substantial gain in the 
completeness of the survey. In many cases, particularly 
when the beam size is large, the filter in Fourier space is 
very narrow, and filters out much of the signal along with 
the noise. In addition, a narrow filter in Fourier space de- 
velops significant side lobes in real space. When such a 
filter is convolved with the density spike at the location 
of the cluster, it enhances both the central peak and an 
annular region at the location of each side lobe. In the 
most extreme cases, several concentric rings may appear 
around the central maximum that marks the location of 
the cluster. These rings can overlap with other rings from 
nearby structures in complicated ways. If these rings are 
misidentified as separate clusters, they negatively impact
the efficiency$^2$. Thus making the filter a little wider than
the matched filter for experiments with lower resolution 
helps by improving the completeness significantly while it 
ameliorates the ringing. 

2.6. Peak finding 

Even a glance at Figures 1 and 2 is enough to show
that finding peaks in an SZE observation is fraught with 
difficulties. The peaks are often irregularly shaped, con- 
tain significant substructure and merge with neighboring 
peaks. We spent considerable time trying different meth- 
ods of detecting substructure, merging peaks and imposing 
thresholds on either total flux or peak flux. We found that 
exactly which peaks passed which cuts depended on how 
these issues were handled, but were unable to find a strategy which worked in every case.
Smoothing the maps with a resolution of 1' or more, matched to the angular size of clusters
at cosmological distances, mitigated some of the sensitivity but did not entirely eliminate
the dependence on peak finding properties. In particular, which peaks are found in dense
environments (which may for example correspond to interacting halos) depends on one's choice
for peak edges and merger criteria. This is an analogue of the problem we found in defining
clusters in the 3D dataset under similar conditions.

$^2$ Such considerations will obviously apply to any filtering technique with a compensated
filter, such as for example a wavelet based approach. The key parameter is the mean
separation of bright sources compared to the filtering scale. If the separation is smaller
than the filtering scale, the compensated filter needs to be used with caution.

However we did notice that while the exact numbers 
were sensitive to the peak finder, the general trends we 
find (below) were not very sensitive. For this reason we 
chose the simplest peak finder, with the fewest adjustable 
parameters, in order to present our results. Specifically 
we used a simple algorithm, similar in nature to the one in
White, van Waerbeke & Mackey (2002). We first record 
and number all the pixel locations of local maxima in the 
map. We search around each local maximum and include 
in the extended peak all pixels with a value greater than 
25% of the maximum value, out to a maximum radius of 
10 pixels. All peaks are extended at the same rate, so that 
adjacent peaks do not merge into one object. This has a 
tendency to split objects with significant substructure into 
multiple peaks, but for smoothed maps such a situation is 
reasonably rare. The algorithm returns a peak number 
(or no peak) for every pixel in the map. We have tested 
sensitivity to the 25% criterion for associating a pixel with 
a peak and find that, for the smoothed maps, complete- 
ness and efficiency are unaffected by moderate changes in 
this parameter. The 'value' associated with a peak is the 
peak temperature fluctuation. Since we are smoothing on 
scales comparable to the total size of a peak it makes little 
difference whether we choose peak temperature or total 
fluctuation. The fraction of peaks in the mock observation that matched at least one halo
is the efficiency, and the fraction of halos that matched a peak is the completeness.

Fig. 3. The role of sensitivity and resolution in SZE surveys. The three types of curves
(from top to bottom at left of plot) are the power spectra of the primary CMB anisotropies
(thick solid), the SZE signal (an average of the power spectra in our ten maps; solid+dashed
band), and the instrument noise (dotted, dot-dashed and dashed). The dotted curve has a 3'
FWHM Gaussian beam with 4 $\mu$K of noise per 1'. The dot-dashed curve has the same beam
size, but 2 $\mu$K noise per 1'. The dashed curve has a 1' beam with 2 $\mu$K of noise.
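The peak finder described in this section can be sketched in a few lines. The version below is one reading of that description and not the code used for the results: it takes strict 8-neighbour local maxima, grows all peaks one pixel of radius at a time so that neighbours cannot merge, and does not require the claimed pixels to be connected. Names and the toy image are hypothetical.

```python
def find_peaks(image, frac=0.25, max_radius=10):
    """Label pixels by peak. Returns (maxima, labels).

    maxima : list of (row, col) positions of local maxima
    labels : 2-D list holding a peak index, or None where no peak claims the pixel
    """
    ny, nx = len(image), len(image[0])

    # Step 1: record every pixel strictly larger than all of its neighbours.
    maxima = []
    for y in range(ny):
        for x in range(nx):
            neighbours = [image[j][i]
                          for j in range(max(0, y - 1), min(ny, y + 2))
                          for i in range(max(0, x - 1), min(nx, x + 2))
                          if (j, i) != (y, x)]
            if all(image[y][x] > n for n in neighbours):
                maxima.append((y, x))

    labels = [[None] * nx for _ in range(ny)]
    for k, (y, x) in enumerate(maxima):
        labels[y][x] = k

    # Step 2: extend all peaks at the same rate, one unit of radius per pass.
    for r in range(1, max_radius + 1):
        for k, (py, px) in enumerate(maxima):
            cut = frac * image[py][px]
            for y in range(max(0, py - r), min(ny, py + r + 1)):
                for x in range(max(0, px - r), min(nx, px + r + 1)):
                    inside = (y - py) ** 2 + (x - px) ** 2 <= r * r
                    if inside and labels[y][x] is None and image[y][x] > cut:
                        labels[y][x] = k
    return maxima, labels


toy = [[0, 0, 0, 0, 0, 0, 0],
       [0, 1, 2, 1, 0, 0, 0],
       [0, 2, 6, 2, 0, 1, 0],
       [0, 1, 2, 1, 1, 4, 1],
       [0, 0, 0, 0, 0, 1, 0]]
maxima, labels = find_peaks(toy)
print(maxima)
for row in labels:
    print(row)
```

The value attached to each peak would then be the image value at its maximum, matching the choice of peak temperature fluctuation made in the text.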
Certainly more complicated peak finding methods could be employed, a different filter could
be tried (e.g. a wavelet based method such as described in Cayon et al.; Herranz et al.
2002, which corresponds to bandpass filtering) and more sophisticated modeling (e.g. along
the lines of the CLEAN algorithm: Hogbom 1974; Clark 1980) could remove some of the
artifacts introduced by the filtering. Because of this we feel that improved strategies for
finding peaks in such maps are an area worthy of more attention.

3. RESULTS 

We find that the number of clusters that can be recov- 
ered from the mock SZE survey is a strong function of the 
beam size, the level of noise, and the threshold at which 
we identify cluster candidates. The level of contamination 
of the cluster candidates also depends strongly on these 
quantities. In general, decreasing the beam size, decreas- 
ing the noise, and decreasing the threshold for identifying 
candidates in the SZE map will improve completeness at 
the expense of efficiency. Since efficiency is not the exact 
inverse of completeness, some combinations of these three 
parameters will yield better results than others. We also 
find that multi-frequency information can be extremely 
valuable in finding clusters, by removing the dominant 
noise source at large angular scales. 

Figure 1 shows a 'perfect' observation by a multi-frequency
instrument. The Compton-$y$ map has been
smoothed at 1', but no noise or CMB has been added. The 
grey-scale has been selected to emphasize the most promi- 
nent peaks. We have indicated the peaks corresponding to 
the most massive clusters in the field by solid circles, and 
the biggest peaks in the field by dashed circles. As one 
can see most of the massive clusters are easily recovered 
with few false positives, but there is not a 1-1 correspon- 
dence even at these high thresholds. The source of the 
disagreement can be traced to the large number of ob- 
jects in the maps (confusion) and a scatter in the relation 
between the mass and the observable SZE (White, Hernquist
& Springel 2002). The sources of this scatter are
threefold: First, there is a time evolution in the relation
between the mass of a cluster and its temperature, Eq. (1).
Since the clusters are from a large range in redshifts, this 
causes some scatter. Secondly, projection effects are non- 
negligible. In fact, clusters have a tendency to form in 
over-dense regions, often at the intersection of a beaded 
filamentary structure, increasing the probability of non- 
trivial line-of-sight projection. Finally, clusters are not 
spherical and the signal strength depends on their orien- 
tation. Since the lower mass halos are intrinsically more 
numerous, even misidentifying a small fraction of them can 
negatively impact the survey efficiency. 

Of course the situation depicted in Figure 1 is highly unrealistic. To truly assess how well
a survey can find clusters we need to include both astrophysical and instrument noise. The
images in Figure 2 display the various stages of our mock SZE survey. The pure SZE map is
shown in panel one. Panel two shows how the SZE signal from clusters is clouded by adding
the CMB, convolving with a Gaussian beam of 1' angular extent, and further adding Gaussian
random noise. Panel three shows the map in panel two after it has been filtered with the
matched filter described above.

Figure 3 illustrates the role of the size of the beam and the noise level in locating
clusters with an SZE survey. The three types of curves are the power spectra of the primary
CMB anisotropies, the SZE signal (an average of the power spectra in our ten maps), and the
instrument noise. (In the case of multi-frequency observations the primary CMB signal could
be eliminated or at least strongly suppressed.) The range of scales over which an experiment
is sensitive to the SZE consists of those scales for which the SZE signal is not overpowered
by either the CMB or the instrument noise. The dashed lines above and below the SZE power
spectrum are meant to remind the reader that the normalization of the power spectrum depends
non-trivially on the cosmology we have simulated, particularly on the value of $\sigma_8$,
and could change by a factor of 2 or 3 with different initial assumptions. In circumstances
where the window of SZE sensitivity is narrow, such a change profoundly affects the
projected yield of an experiment. The three noise curves are plots of $C_\ell^{\rm noise}$,
which depends on the level of noise and on the beam size. For clarity, all noise levels
reported are in $\mu$K per 1' pixel, regardless of the beam size. The dotted curve has a 3'
Gaussian beam at full width half maximum, with 4 $\mu$K of noise. The dot-dashed curve has
the same beam size, but the level of noise has been reduced to 2 $\mu$K. It is clear that
reducing the level of noise has a relatively small impact on the range of scales that can be
probed. In contrast, the dashed curve, which has a beam size of 1' with 2 $\mu$K of noise,
demonstrates that decreasing the beam size has a dramatic effect on the range of
sensitivity. Alternatively, using multi-frequency observations to reduce the contribution of
the primary anisotropies can allow one to work at lower $\ell$.

3.1. Multi-frequency observations 
Fig. 4. Completeness in finding halos more massive than
$3\times10^{14}\,h^{-1}M_\odot$ in maps with noise 5 $\mu$K per 1' pixel with peaks selected
as local maxima.
We begin by examining maps which do not include the 
contribution from the CMB. Such maps could be obtained 
using multi-frequency observations of the same field (we 
will not address here the methods by which component 
separation is done; for a recent survey of the literature 
see e.g. Vielva et al. 2001 or Herranz et al. 2002). This 
removes the largest source of confusion from the SZE maps. 
To generate these maps we converted the Compton-y maps 
to temperature fluctuation maps, smoothed them with a 
Gaussian filter using an FFT, added an appropriate level 
of pixel (white) noise and once again smoothed the maps. 
This last stage of smoothing was necessary since our pixel 
scale is significantly smaller than the beam, leading to a 
large per pixel noise. Smoothing the maps reduces this 
noise with little effect on the signal. 
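The map-making steps just described can be sketched as follows. This is a minimal illustration rather than the pipeline actually used: the Rayleigh-Jeans conversion ΔT/T = -2y, the value T_CMB = 2.725 K, the periodic boundary conditions implied by the FFT, and the width of the final smoothing (`final_fwhm_arcmin`) are all assumptions, and the function names are hypothetical.

```python
import numpy as np

T_CMB_UK = 2.725e6  # CMB temperature in microkelvin (assumed value)


def gaussian_smooth(image, fwhm_pix):
    """Convolve a map with a Gaussian of the given FWHM (in pixels) via FFT."""
    sigma = fwhm_pix / np.sqrt(8.0 * np.log(2.0))
    kx = 2.0 * np.pi * np.fft.fftfreq(image.shape[1])  # radians per pixel
    ky = 2.0 * np.pi * np.fft.fftfreq(image.shape[0])
    k2 = kx[None, :] ** 2 + ky[:, None] ** 2
    kernel = np.exp(-0.5 * k2 * sigma ** 2)  # Fourier transform of a Gaussian
    return np.fft.ifft2(np.fft.fft2(image) * kernel).real


def mock_observation(y_map, pixel_arcmin, beam_fwhm_arcmin,
                     noise_uK_per_arcmin_pixel, final_fwhm_arcmin, rng):
    """Compton-y map -> beam-smoothed, noisy temperature map in uK."""
    # Assumption: Rayleigh-Jeans limit, dT / T = -2 y.
    dT = -2.0 * T_CMB_UK * y_map
    dT = gaussian_smooth(dT, beam_fwhm_arcmin / pixel_arcmin)
    # White noise quoted per 1' pixel: the rms per map pixel scales
    # inversely with the pixel side, so small pixels are much noisier.
    sigma_pix = noise_uK_per_arcmin_pixel / pixel_arcmin
    noisy = dT + rng.normal(0.0, sigma_pix, size=dT.shape)
    # Final smoothing suppresses the large per-pixel noise.
    return gaussian_smooth(noisy, final_fwhm_arcmin / pixel_arcmin)


rng = np.random.default_rng(0)
y_map = np.zeros((128, 128))
y_map[64, 64] = 1e-4  # a single toy source
obs = mock_observation(y_map, pixel_arcmin=0.25, beam_fwhm_arcmin=1.0,
                       noise_uK_per_arcmin_pixel=5.0,
                       final_fwhm_arcmin=1.0, rng=rng)
```

The Gaussian kernel equals unity at zero frequency, so smoothing preserves the total signal in the map while averaging down the uncorrelated pixel noise.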

We find that for noise levels as low as 5 μK per 1' pixel 
both the completeness and efficiency can be very good at 
high angular resolution. For thresholds above 5σ more 
than 80% of the peaks correspond to rich clusters above 
3 × 10^14 h^-1 M_⊙, and for beams of 1' FWHM such a cut 
recovers 75% of the existing clusters above this mass 
threshold. As long as a cut of at least 4σ is applied the 
efficiency is greater than 60%. For a beam smaller than 2' 
FWHM more than half of the clusters can be recovered. 
However, such a low noise level may be optimistic for 
upcoming experiments. If we increase the noise to 10 μK per 
1' pixel the situation degrades. The best completeness is 
now around 60% and only at low thresholds, for which the 
efficiencies are low. To avoid confusion the beam also needs 
to be less than 1.5' FWHM. 
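The two statistics quoted above can be illustrated with a simple peak finder. In this sketch completeness is the fraction of clusters above the mass cut that have a candidate peak nearby, and efficiency is the fraction of candidate peaks that lie near such a cluster. The matching radius, the convention that clusters appear as positive peaks in a map expressed in units of the noise rms, and the function names are assumptions made for illustration.

```python
import numpy as np


def local_maxima(image, threshold):
    """Return (row, col) of pixels above threshold that exceed all 8 neighbours."""
    padded = np.pad(image, 1, mode="constant", constant_values=-np.inf)
    centre = padded[1:-1, 1:-1]
    is_peak = centre > threshold
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            neighbour = padded[1 + dy:padded.shape[0] - 1 + dy,
                               1 + dx:padded.shape[1] - 1 + dx]
            is_peak &= centre > neighbour
    return np.argwhere(is_peak)


def completeness_efficiency(peaks, clusters, match_radius_pix):
    """clusters: (row, col) positions of halos above the mass threshold."""
    peaks = np.asarray(peaks, dtype=float).reshape(-1, 2)
    clusters = np.asarray(clusters, dtype=float).reshape(-1, 2)
    if len(peaks) == 0 or len(clusters) == 0:
        return 0.0, 0.0
    dist = np.linalg.norm(peaks[:, None, :] - clusters[None, :, :], axis=2)
    matched = dist <= match_radius_pix
    completeness = matched.any(axis=0).mean()  # clusters with a nearby peak
    efficiency = matched.any(axis=1).mean()    # peaks with a nearby cluster
    return completeness, efficiency


# Toy map in units of the noise rms: two real detections and one false peak.
snr_map = np.zeros((20, 20))
snr_map[5, 5] = 6.0
snr_map[10, 12] = 7.0
snr_map[15, 3] = 8.0
clusters = [(5, 5), (10, 13), (2, 17)]  # the last one is missed

peaks = local_maxima(snr_map, threshold=5.0)
comp, eff = completeness_efficiency(peaks, clusters, match_radius_pix=2.0)
```

Raising `threshold` removes candidates, which can only lower the completeness, while the efficiency rises or falls depending on whether the discarded peaks were spurious.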

As a concrete example we consider an idealization of the 
Planck satellite surveying ~ 70% of the sky. In terms of 
cluster finding, Planck's capabilities are primarily limited 
by its resolution, since many distant clusters subtend a much 
smaller angle than Planck's beam size. As such, the sample 
found by Planck will be biased in that it will tend to 
identify the nearest clusters (see Aghanim et al. 1997 or 
Kay, Liddle & Thomas 2001 for more details and Herranz 
et al. 2002 for a recent study of the expected Planck SZE 
catalog). We assume that component separation has left 
no primary CMB signal in our SZE map. In particular, we 
optimistically computed the completeness and efficiency 
with equal parts noise from the 217 GHz (no SZE) and 
353 GHz channels, added in quadrature. These channels 
both have an angular resolution of 5'. Within this (overly) 
simplistic approximation Planck finds close to half of the 
objects more massive than 3 × 10^14 h^-1 M_⊙, but with our 
simple peak finding algorithm the efficiency is below 20%. 
Over 70% of the sky this is a very large sample, useful for 
many studies of clusters. However a more sophisticated 
cluster finding algorithm would need to be employed be- 
fore this sample could be used as a cosmological probe. 
Improved methods for finding massive clusters in low 
resolution but multi-frequency data are a topic which we 
intend to pursue further in a future publication. 

3.2. Spatial filtering 

Let us now turn to the case of single frequency maps. 
In this situation we must remove the large CMB contribution 
by using its spatial structure. Figures 4 and 5 are 
contour plots of the completeness and efficiency that can 
be achieved by a single frequency experiment, using the 
optimal filter of Eq. (R), for various beam sizes. Again we 
consider all rich clusters above 3 × 10^14 h^-1 M_⊙. The 
contours are computed at a constant noise level of 5 μK per 
1' pixel. On the y axis is the threshold used on the filtered 
map to identify cluster candidates. The threshold indicates 
the statistical significance of the cluster candidate above 
the level of instrument noise. The 1σ value is determined 
by filtering the noise map without the SZE or CMB signal 
and computing the resulting variance. In Figure 4 the 
completeness improves as the resolution gets better 
(smaller beam); however, some fraction of the clusters is 
overlooked as the candidate threshold is raised. Raising 
the threshold is nevertheless useful for improving the 
efficiency of the survey, as can be seen in Figure 5. 
Because all of the candidates need to be followed up for 
redshift information (and positive identification as a real 
cluster, since the contamination for some experiments can 
be large), efficiency is a high priority. Since the contours 
in Figures 4 and 5 are not completely parallel, there is not 
a simple trade-off between completeness and efficiency; 
rather, there are regions that are clearly better 
compromises than others. 

Fig. 5. Efficiency at finding halos more massive than 
3 × 10^14 h^-1 M_⊙ in maps with noise 5 μK per 1' pixel, with 
peaks selected as local maxima. 

Fig. 6. The fraction of the total number of clusters in each 
mass bin of width 1.75 × 10^13 h^-1 M_⊙ found in mock surveys 
of three different beam sizes: 1', 2.5' and 4' (top to bottom). 
The solid curves indicate the results when the matched filter 
is used; the dashed curves show the improvement in 
completeness when a less severe filter is employed. Slope in 
the curves indicates a selection bias. 
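The noise normalisation used for the thresholds, in which a noise-only map is filtered and its scatter is measured, might look like the following in the flat-sky approximation. The band-pass `toy_filter` below is a hypothetical stand-in and not the optimal filter of the text; the helper names and the map parameters are also assumptions.

```python
import numpy as np


def apply_filter(image, pixel_rad, filter_of_ell):
    """Multiply the map's Fourier transform by an isotropic filter F(ell)."""
    ny, nx = image.shape
    # Flat sky: ell = 2 pi * (spatial frequency in cycles per radian).
    lx = 2.0 * np.pi * np.fft.fftfreq(nx, d=pixel_rad)
    ly = 2.0 * np.pi * np.fft.fftfreq(ny, d=pixel_rad)
    ell = np.sqrt(lx[None, :] ** 2 + ly[:, None] ** 2)
    return np.fft.ifft2(np.fft.fft2(image) * filter_of_ell(ell)).real


def noise_sigma(noise_map, pixel_rad, filter_of_ell):
    """1-sigma level: rms scatter of the filtered noise-only map."""
    return apply_filter(noise_map, pixel_rad, filter_of_ell).std()


def toy_filter(ell):
    # Hypothetical band-pass filter, used only to make the example run.
    return np.exp(-0.5 * ((ell - 3000.0) / 1000.0) ** 2)


rng = np.random.default_rng(1)
pixel_rad = 0.5 * np.pi / (180.0 * 60.0)  # 0.5' pixels
noise = rng.normal(0.0, 10.0, size=(256, 256))

sigma = noise_sigma(noise, pixel_rad, toy_filter)
# Dividing a filtered map by sigma expresses it in units of the noise rms,
# which is the scale on which a candidate threshold would be applied.
significance = apply_filter(noise, pixel_rad, toy_filter) / sigma
```

Because the same filter is applied to the noise map and to the full map, a threshold quoted in units of σ refers to instrument noise alone and does not include residual CMB or confusion from other halos.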

The very low efficiencies evident in the bottom left of 
Figure 5 should not be read as confusion by noise in the 
map. There are no noise peaks of 10σ in these maps, 
even including the CMB! Rather they come from lower 
mass halos which happen to give rise to peaks which are 
many sigma above the noise when the beam is small. For 
a 1' FWHM beam at 5 μK even halos with 10^14 h^-1 M_⊙ 
can give rise to a > 5σ peak, and these outnumber the 
M > 3 × 10^14 h^-1 M_⊙ halos 10:1. Raising the threshold 
even further (off the top of the plot) removes the lower 
mass peaks and improves efficiencies, but at the expense 
of completeness as in the above examples. 

3.3. Non-optimal filter 

Another degree of freedom is to change the shape of the 
filter instead of adjusting the threshold. As mentioned 
earlier, broadening the filter can improve completeness 
without clobbering the efficiency because the increase in 
noise confusion is moderately compensated by the decrease 
in ringing. Broadening the filter can also decrease both the 
mass and distance biases that occur in SZE surveys with 
limited angular resolution. Figures 6 and 7 show the 
completeness for three different beam sizes, binned in mass 
and distance respectively. These plots demonstrate that 
while high resolution experiments are relatively unbiased, 
surveys with a large beam size will tend to deselect 
distant or less massive clusters because the angle they 
subtend is smaller than the resolution of the experiment. By 
broadening the filter (dashed lines in Figs. 6 and 7), the 
completeness of the sample improves, as does the bias in 
some cases. Decreasing the bias, or at the very least 
quantifying it, is critical if the survey is to be used to 
constrain the cosmological parameters. 

Fig. 7. The fraction of the total number of clusters in each 
distance bin of width 235 h^-1 Mpc found in mock surveys of 
three different beam sizes: 1', 2.5' and 4' (top to bottom). 
The solid curves indicate the results when the matched filter 
is used; the dashed curves show the improvement in 
completeness when a less severe filter is employed. Slope in 
the curves indicates a selection bias. In this cosmology 
z = 1 is at (approximately) 2300 h^-1 Mpc. The distribution 
of halos more massive than 3 × 10^14 h^-1 M_⊙ peaks at 
z ~ 0.6 or 1400 h^-1 Mpc and falls by two orders of magnitude 
by z = 2 (3600 h^-1 Mpc). 

We find that broadening the filter on the high-ℓ side 
(noise side) in Fourier space is ineffective because the 
increase in the noise confusion is too great, completely 
destroying the efficiency. Instead we have filtered less 
harshly on the low-ℓ portion of the mock survey, replacing 
the 1/C_ℓ^CMB portion of the filter with a much wider half 
Gaussian that peaks at the same value of ℓ. The improvement 
in the completeness is naturally accompanied by a 
corresponding decrease in the efficiency. The efficiencies 
for 4', 2.5', and 1' beam sizes in the matched filter case 
were 93%, 86%, and 52% respectively. In the modified filters, 
the width of the Gaussian was chosen so that the efficiency 
would be approximately 45%. These widths were 1600, 
2500, and 2500 in ℓ respectively. This method could 
be used to satisfy any efficiency requirement lower than 
the efficiency of the matched filter, in order to improve 
completeness. While we did not explicitly try it, we expect 
that the Mexican hat wavelet, which corresponds to a filter 
x^2 e^{-x^2} with x ∝ ℓ, would also work well. 
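The modification described above, in which the low-ℓ side of the filter is replaced by a wider half Gaussian peaking at the same ℓ, can be sketched on a sampled filter as below. The narrow Gaussian `toy` is a hypothetical stand-in for the matched filter, and interpreting the quoted width as the Gaussian standard deviation in ℓ is an assumption.

```python
import numpy as np


def broaden_low_ell(ell, filt, width):
    """Replace the filter below its peak with a half Gaussian of given width.

    The high-ell (noise) side is left untouched; `width` is taken to be the
    Gaussian standard deviation in ell (an assumption).
    """
    ell = np.asarray(ell, dtype=float)
    filt = np.asarray(filt, dtype=float)
    i_peak = int(np.argmax(filt))
    half_gauss = filt[i_peak] * np.exp(
        -0.5 * ((ell - ell[i_peak]) / width) ** 2)
    return np.where(ell < ell[i_peak], half_gauss, filt)


ell = np.arange(0.0, 10000.0, 10.0)
# Hypothetical stand-in for the matched filter: a narrow peak at ell = 3000.
toy = np.exp(-0.5 * ((ell - 3000.0) / 400.0) ** 2)
broad = broaden_low_ell(ell, toy, width=1600.0)
```

Keeping the peak position and the high-ℓ side fixed means the only change is how much large-scale power is admitted, so the width acts as a single parameter with which efficiency can be traded for completeness.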

4. CONCLUSIONS 

The SZE offers a new and potentially very powerful 
method for finding high redshift clusters of galaxies. Be- 
cause the amplitude of the effect is independent of the 
distance to the cluster, it appears to be one of the best 
techniques for constructing a large sample of high-z clus- 
ters. 

Finding clusters with an SZE survey is, however, fraught 
with complications, some of which we have begun to ad- 
dress here using mock observations of simulated maps. 
While there remains significant uncertainty in the overall 
level of the SZE angular power spectrum and our model- 
ing of the effect has been somewhat crude, we already see 
that the requirements on frequency coverage, angular res- 
olution and noise are quite severe for experiments hoping 
to find large samples of clusters through the SZE. In single 
frequency maps, the primary CMB anisotropies prove to 
be a large contaminant. Indeed for experiments with an- 
gular resolutions of > 1' there is little spatial range where 
the SZE signal dominates. 

Many important effects must be balanced when design- 
ing the experiment and analyzing the data. The angular 
resolution of the instrument used is of paramount impor- 
tance, a key element in determining the yield of the survey. 
Decreasing the beam size improves the completeness, and 
more importantly decreases the bias against distant and 
lower mass clusters. For any given resolution, however, 
there are adjustable parameters in the data analysis that 
can help reduce the bias and maximize completeness, at 
the moderate expense of efficiency. Efficiency is still a vital 
part of the survey design, however, because each cluster 
candidate must be followed up for positive identification 
and redshift information. The intrinsic contamination that 
occurs as a result of projection effects makes follow-up 
necessary even if the precise redshift is not needed. To 
obtain a high level of completeness with correspondingly 
high efficiency requires a multi-frequency observation with 
angular resolution of 1' or better, and noise at or below 
10 μK per 1' pixel. 

We have found that aggressive spatial filtering, to en- 
hance the clusters against the background, can have the 
unwanted side effect of introducing ringing into the maps. 
Given the large number of sources in a typical simulated 
map, overlapping 'rings' can produce significant false de- 
tections. Multi-frequency information could help reduce 
some of the pitfalls inherent in a single frequency analysis, 
and may provide a less severe alternative than matched 
filtering to disentangle SZE clusters from primary CMB 
anisotropies. For any of these approaches, more sophis- 
ticated methods of identifying peaks in the SZE map 
(e.g. matched filtering) need to be investigated. Better 
data analysis offers the hope of increased completeness 
without a sacrifice in efficiency. 